Excel and Data Analytics

IMPORTANT -- Required Excel Add Ins

In order to complete all of the tasks in this course, you will need to have two Excel Add In packages:

  • Analysis ToolPak Add-in

  • Solver Add-in

  • Power Pivot Add-in

To add these two packages to your copy of Excel do the following:

  1. Click the File Menu Item

  2. Click on the Options Menu Item (at the bottom)

  3. Click on Add-ins Menu Item

  4. Click on Analysis ToolPak and Solver Add-in

  5. Click on the Go button

  6. Be sure that Analysis ToolPak and Solver Add-in are checked in the popup. Click OK.

Data analytics is a wonderful, fresh, exciting field of endeavor. It is what sport fans do for pleasure as they gather stats, play what-if games with fantasy teams, and predict future ….

Analytics is a conglomeration of many other fields including data wrangling, reporting, descriptive and inferential statistics, forecasting, and predictive methods.

Excel is the Mulit-tool of BI

Can you do analytics in Excel? Yes! It is fascinating that despite the many sophisticated data analysis tools, the most used tool is still the ubiquitous and rather long-in-the-tooth tool Excel. In survey after survey, Excel is the most pervasively used data analytics tool, even by senior analysts. The reasons for its widespread use perhaps center on its familiarity and ease of use.

Excel is the first piece of software students learn with the central purpose of analyzing data. It quickly becomes the multi-tool students turn to first to perform any kind of data manipulation. Excel also has the advantage of letting an analyst arrange the layout and results in ways that make sense in the particular situation. You have almost total control.

Excel is an ideal platform for working through an analytics problem, it give you the tools you need, lets you see the intermediate steps (because you are involved in every step of the process), and finally yields the results. Even though senior analysts have refined tools like R, Python, and Tableau, they still often turn to Excel for data exploration and visualization, data preparation, and preliminary modeling.

Leatherman Multi-tool

Some of the various kinds of analyses that can be performed in Excel:

  • What-if Analysis

  • Sensitivity Analysis

  • Scenario Analysis

  • Breakeven Analysis

  • Multi-dimensional Analysis

  • Trend Analysis

  • Optimization (Prescriptive Analysis)

  • Predictive Analytics

  • Forecasting

  • Data Preparation

  • Regression Analysis

  • Decision Analysis

  • Simulation

We will be doing these analyses throughout this course.

The purpose of this book is to increase your skill at using Excel as a data analytics tool. However, it is important to note that Excel is not better than R, Python, Tableau or other specialized tools. Those tools are often faster, more elegant, and can work on larger data sets. A business intelligence analyst’s main tools for dashboards are Tableau, Tibco, and PowerBI among several others. A data scientist’s main tools are SQL, Python, and R.1

On the other hand, how useful is it to grab some data with your multi-tool (Excel), explore it, clean it up, visualize it, and analyze it. Then, after you have an idea of what analyses you want to do you can use a specialized tool like R, or after you have an idea of the dashboard or visualization you want to make you can create the beautiful version using Tableau or PowerBI. Excel, even though not the optimal tool in every situation, can be, and is, used in almost all kinds of analytics.

Business Intelligence and Analytics

The definitions of Business Intelligence (BI) and analytics are many and varied. To some, business intelligence is the all-encompassing term that combines business analytics, data mining, data visualization, and infrastructure to help organizations make data-driven decisions.2 Thomas Davenport3, author of many articles and books on analytics in business, argues that business intelligence should be divided into querying, reporting, online analytical processing, an "alerts" tool, and business analytics.4

Business Intelligence considers historical, current and predictive views of an organization and its performance in the economic environment in which it exists. Business intelligence is used to make business decisions concerning both short-range operational choices and long-range strategic directions. For our consideration in this course, we can describe BI as computer applications that change raw data into significant, meaningful information to help organizations make better decisions. We define data as raw, unorganized facts, but information is processed data that has been analyzed and organized so that it has meaning and is useful.

Business analytics is the subset of BI focusing on prediction, and optimization, rather than reporting. Thus, we see the modern division between reporting and more advanced analytics.

Every business is different and the business environment for every type of business is also different. An important element of data analysis is the evaluation of how the company is performing in its business environment. Perhaps the primary purpose of data analysis is to be able to understand in-depth the health and future of the organization. One important element of this analysis is to define and calculate Key Performance Indicators (KPI). A key performance indicator is a type of performance measurement that accurately indicates how an organization is performing. A good PKI is like a gage to indicate if the business is on the right track for a successful future. Usually it requires multiple KPIs to accurately measure the health and condition of an organization. And the best KPIs are not always the simple calculations of revenue or profit. A good set of KPIs in an organization must have the following characteristics:

  • Be well-defined and quantifiable

  • Be approprate for the industry and Line of Business

  • Be well communicated and understood throughout the organization

  • Actually be important for achieving the business's objectives

As mentioned earlier, Microsoft Excel is the most widely used tool to help identify and and then use the Key Performance Indicators to make well considered strategic decisions.

Structure of the Book

This text is structured using the well-known analytic hierarchy created by SAS and Gartner, and embellished by many others over the years. This continuum places reporting and analytics activities between the axes of degree on intelligence (as we move from raw data to information to intelligence) and competitive advantage.

Figure 1.1: Business Intelligence and Competitive Advantage
  • As we move upward and to the right we see note that standard reports answer the business question “What happened?”

  • Ad hoc reports help answer an original new question and OLAP (multidimensional analysis) help us drill-down, or slice and dice, to specifics.

  • Alerts and dashboards help an organization see if they are on track to meet important KPIs and if action is needed.

  • Descriptive statistical analysis and hypothesis testing help us pin down why something is happening. Forecasting help us see future trends.

  • Predictive modeling help us classify things like which customers are likely to buy and which are not, or which transactions are likely to be fraudulent and which are not.

  • Finally, optimization helps us choose the best possible solution to a business problem.

In the chart below, we have mapped the book chapters to the analytic continuum.

Figure 1.2: Mapping Lessons to the continuum

This book uses the functionality of Excel in every area of reporting and analytics.

The first part of this text covers the use of Excel in Business Intelligence Reporting activities such as using reporting functions, multi-dimensional analysis, advanced visualization, conditional formatting, dashboards, and some data handling.

The second part of this text covers forecasting through both linear regression and time-series methods.

The third part of this text covers optimization in several situations: decision in low uncertainty using linear, non-linear, and evolutionary optimization; and decisions in high uncertainty using simulation along with optimization.

The fourth part of this text covers some exploratory and predictive analytic methods of Data Science including clustering, market-basket analysis, and text classification among others.